First, do some initialization and configure logging so that only errors are reported; warnings produced during the computation are suppressed.
%matplotlib inline
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'svg'
import numpy as np
import pandas as pd
import seaborn as sns
import datetime as dt
import matplotlib.pyplot as plt
import universal as up
from universal import tools, algos
from universal.algos import *
sns.set_context("notebook")
plt.rcParams["figure.figsize"] = (16, 8)
# ignore logged warnings
import logging
logging.getLogger().setLevel(logging.ERROR)
Let's try to replicate the results of B. Li and S. Hoi from their article On-Line Portfolio Selection with Moving Average Reversion. They claim superior performance on several datasets using their OLMAR algorithm. These datasets are available in the data/ directory in .csv format; they contain raw prices with artificial tickers. We can start with NYSE stocks from the period 1/1/1985 - 30/6/2010.
# load data using tools module
data = tools.dataset('nyse_o')
# plot first three of them as example
data.iloc[:,:3].plot()
Now we need an implementation of the OLMAR algorithm. Fortunately, it is already implemented in the algos module, so all we have to do is load it and set its parameters. The authors recommend a lookback window $w = 5$ and threshold $\epsilon = 10$ (these are the default parameters anyway). Then just call the run method on our data to get results for analysis.
# set algo parameters
algo = algos.OLMAR(window=5, eps=10)
# run
result = algo.run(data)
Ok, let's see some results. First print some basic summary metrics and plot portfolio equity with UCRP (uniform constant rebalanced portfolio).
print(result.summary())
result.plot(weights=False, assets=False, ucrp=True, logy=True);
Summary:
    Profit factor: 1.89
    Sharpe ratio: 3.34 ± 0.54
    Ulcer index: 23.19
    Information ratio (wrt UCRP): 3.27
    Appraisal ratio (wrt UCRP): 3.05 ± 0.21
    UCRP sharpe: 1.16 ± 0.27
    Beta / Alpha: 1.55 / 165.350%
    Annualized return: 466.14%
    Annualized volatility: 56.69%
    Longest drawdown: 185 days
    Max drawdown: 46.29%
    Winning days: 58.3%
    Annual turnover: 349.1
That seems really impressive; in fact, it looks too good to be true. Let's see how individual stocks contribute to the portfolio equity, disabling the legend to keep the graph clean.
result.plot_decomposition(legend=False, logy=True)
As you can see, almost all wealth comes from a single stock (don't forget the plot uses a logarithmic scale!). So if we used just 5 of these stocks, we would get almost the same equity as if we used all of them. To stress test the strategy, we can remove that stock and rerun the algorithm.
# find name of the most profitable asset
most_profitable = result.equity_decomposed.iloc[-1].idxmax()
# rerun an algorithm on data without it
result_without = algo.run(data.drop(columns=[most_profitable]))
# and print results
print(result_without.summary())
result_without.plot(weights=False, assets=False, ucrp=True, logy=True);
Summary:
    Profit factor: 1.55
    Sharpe ratio: 2.48 ± 0.43
    Ulcer index: 12.67
    Information ratio (wrt UCRP): 2.36
    Appraisal ratio (wrt UCRP): 2.14 ± 0.21
    UCRP sharpe: 1.12 ± 0.27
    Beta / Alpha: 1.48 / 96.840%
    Annualized return: 192.81%
    Annualized volatility: 47.90%
    Longest drawdown: 202 days
    Max drawdown: 45.91%
    Winning days: 56.5%
    Annual turnover: 346.1
We lost about 7 orders of magnitude of wealth, but the results are more realistic now. Let's move on and try adding fees of 0.1% per transaction (we pay \$1 for every \$1000 of stocks bought or sold).
result_without.fee = 0.001
print(result_without.summary())
result_without.plot(weights=False, assets=False, ucrp=True, logy=True)
Summary:
    Profit factor: 1.35
    Sharpe ratio: 1.79 ± 0.34
    Ulcer index: 6.76
    Information ratio (wrt UCRP): 1.61
    Appraisal ratio (wrt UCRP): 1.41 ± 0.21
    UCRP sharpe: 1.12 ± 0.27
    Beta / Alpha: 1.47 / 62.608%
    Annualized return: 108.72%
    Annualized volatility: 47.17%
    Longest drawdown: 382 days
    Max drawdown: 50.09%
    Winning days: 50.2%
    Annual turnover: 346.1
Results still hold, although our Sharpe ratio decreased from 3.34 to 1.79 and the annualized return from 466% to 109%. Now some of you trained in quantitative finance might start asking: "Isn't there some survivorship bias?". Yes, there is. In fact, a huge one, considering that we have almost 25 years of data and a mean-reversion type of strategy.
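As a quick extra check (this cell is not part of the original walkthrough), we could also rerun OLMAR on just the tail of the same NYSE dataset, say the last 1000 trading days, to see whether the edge holds up later in the sample.
# hypothetical extra check: rerun on roughly the last four years of trading days
recent_result = algo.run(data.iloc[-1000:])
print(recent_result.summary())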
Let's see whether the algo works on recent data, too. First download closing prices of several (randomly chosen) stocks from Yahoo.
from pandas_datareader import data as pdr
import yfinance as yf
yf.pdr_override()
# load data from Yahoo
yahoo_data_raw = pdr.get_data_yahoo(['MSFT', 'IBM', 'AAPL', 'GOOG'], start=dt.datetime(2005,1,1))
yahoo_data = yahoo_data_raw['Adj Close']
# plot normalized prices of these stocks
(yahoo_data / yahoo_data.iloc[0,:]).plot()
Instead of using fixed parameters, we will test several window parameters with the run_combination function. It works the same as run, except that it is used as a classmethod and takes lists for the parameter values we want to combine. run_combination returns a list of results which can be used similarly to result.
list_result = algos.OLMAR.run_combination(yahoo_data, window=[3,5,10,15], eps=10)
print(list_result.summary())
list_result.plot()
Summary for window=3:
    Profit factor: 1.09
    Sharpe ratio: 0.61 ± 0.27
    Ulcer index: 1.07
    Information ratio (wrt UCRP): -0.13
    Appraisal ratio (wrt UCRP): -0.20 ± 0.25
    UCRP sharpe: 0.95 ± 0.30
    Beta / Alpha: 1.07 / -4.173%
    Annualized return: 14.99%
    Annualized volatility: 30.87%
    Longest drawdown: 649 days
    Max drawdown: 51.24%
    Winning days: 51.4%
    Annual turnover: 284.1
Summary for window=5:
    Profit factor: 1.12
    Sharpe ratio: 0.74 ± 0.29
    Ulcer index: 1.55
    Information ratio (wrt UCRP): 0.07
    Appraisal ratio (wrt UCRP): -0.01 ± 0.25
    UCRP sharpe: 0.95 ± 0.30
    Beta / Alpha: 1.06 / -0.112%
    Annualized return: 19.64%
    Annualized volatility: 30.50%
    Longest drawdown: 806 days
    Max drawdown: 48.11%
    Winning days: 52.1%
    Annual turnover: 215.5
Summary for window=10:
    Profit factor: 1.15
    Sharpe ratio: 0.88 ± 0.30
    Ulcer index: 1.89
    Information ratio (wrt UCRP): 0.27
    Appraisal ratio (wrt UCRP): 0.21 ± 0.25
    UCRP sharpe: 0.95 ± 0.30
    Beta / Alpha: 1.03 / 4.371%
    Annualized return: 24.43%
    Annualized volatility: 29.94%
    Longest drawdown: 401 days
    Max drawdown: 53.78%
    Winning days: 52.0%
    Annual turnover: 153.3
Summary for window=15:
    Profit factor: 1.10
    Sharpe ratio: 0.66 ± 0.28
    Ulcer index: 1.28
    Information ratio (wrt UCRP): -0.08
    Appraisal ratio (wrt UCRP): -0.10 ± 0.25
    UCRP sharpe: 0.95 ± 0.30
    Beta / Alpha: 1.03 / -2.095%
    Annualized return: 16.60%
    Annualized volatility: 30.02%
    Longest drawdown: 627 days
    Max drawdown: 57.16%
    Winning days: 51.6%
    Annual turnover: 117.6
Since we don't know the best parameters in hindsight, we will invest equal money in each of them at the beginning and let them run. This is called a buy and hold strategy. Portfolio equities in list_result can be regarded as stock prices and used as input for a new algo (buy and hold in this case). This way you can chain algorithms however you like, for example OLMAR on OLMAR (see the sketch after the next cell). To compare the result with individual assets or a uniform constant rebalanced portfolio, use the parameters assets and ucrp.
# run buy and hold on OLMAR results and show its equity together with original assets
algos.BAH().run(list_result).plot(assets=True, weights=False, ucrp=True);
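As a sketch of the chaining idea mentioned above (this cell is not part of the original walkthrough), the same list of equity curves could also be fed into another OLMAR instance, giving "OLMAR on OLMAR":
# hypothetical chaining example: run OLMAR on the OLMAR equity curves
algos.OLMAR(window=5, eps=10).run(list_result).plot(assets=True, weights=False, ucrp=True);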
Ok, so that was enough for a start. There are plenty of other algorithms in the algos module, collected from research papers on online portfolio selection, including the famous Universal Portfolio by Thomas Cover.
The entire package is actually pretty simple. Algorithms are subclasses of the base Algo class, and methods for reporting, plotting and analysing are built on top of this class. I will illustrate this on a simple mean-reversion strategy. The idea is that badly performing stocks will revert to their mean and have higher returns than those above their mean. Here is the complete code; the comments should be self-explanatory.
from universal.algo import Algo
import numpy as np
import pandas as pd

class MeanReversion(Algo):
    # use logarithm of prices
    PRICE_TYPE = 'log'

    def __init__(self, n):
        # length of moving average
        self.n = n
        # step function will be called after min_history days
        super().__init__(min_history=n)

    def init_weights(self, cols):
        # start with zero weights
        return pd.Series(np.zeros(len(cols)), cols)

    def step(self, x, last_b, history):
        # calculate moving average of the last n (log) prices
        ma = history.iloc[-self.n:].mean()
        # put positive weight only on assets below their moving average
        delta = x - ma
        w = np.maximum(-delta, 0.)
        # normalize so that the weights sum to 1
        return w / sum(w)
That's all. Now let's try it on the NYSE data.
mr = MeanReversion(n=20)
result = mr.run(data)
print(result.summary())
result.plot(assets=False, logy=True, weights=False, ucrp=True);
Summary:
    Profit factor: 1.46
    Sharpe ratio: 2.13 ± 0.38
    Ulcer index: 4.84
    Information ratio (wrt UCRP): 1.94
    Appraisal ratio (wrt UCRP): 1.67 ± 0.21
    UCRP sharpe: 1.16 ± 0.27
    Beta / Alpha: 1.15 / 32.157%
    Annualized return: 60.46%
    Annualized volatility: 23.47%
    Longest drawdown: 808 days
    Max drawdown: 41.29%
    Winning days: 52.4%
    Annual turnover: 365.7
Not bad considering how simple the strategy is. The next step could be performance optimization. To profile your strategy, you can use the profile function in universal.tools, which profiles the code using the fantastic line_profiler. After identifying the most critical parts of the code, you have two options: either optimize your step function (using tools such as weave, numba, theano or cython), or override the weights method if your code can be vectorized easily (beware of forward-looking bias!).
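To make the vectorization idea concrete, here is a minimal sketch (not from the original walkthrough) of the same mean-reversion logic with the per-step loop replaced by a single vectorized computation. It assumes that overriding weights with a method receiving the whole (log-)price frame S and returning a DataFrame of weights of the same shape is supported, and that the framework applies each row of weights to the following period's returns, so using data up to and including day t introduces no lookahead.
from universal.algo import Algo
import numpy as np
import pandas as pd

class VectorizedMeanReversion(Algo):
    # work with logarithms of prices, like MeanReversion above
    PRICE_TYPE = 'log'

    def __init__(self, n):
        # length of moving average
        self.n = n
        super().__init__(min_history=n)

    def init_weights(self, cols):
        # start with zero weights
        return pd.Series(np.zeros(len(cols)), cols)

    def weights(self, S, **kwargs):
        # extra kwargs are accepted only for compatibility with the base class call
        # rolling mean of the last n log prices, using data up to and including day t
        ma = S.rolling(self.n).mean()
        # positive weight only for assets below their moving average
        w = (ma - S).clip(lower=0.)
        # normalize each row to sum to 1; rows with no signal or warm-up NaNs get zero weight
        w = w.div(w.sum(axis=1), axis=0).fillna(0.)
        return w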